Determining structures of individual RNA conformers using atomic force microscopy images and deep neural networks

The vast percentage of the human genome is transcribed into RNA, many of which contain various structural elements and are important for functions. RNA molecules are conformationally heterogeneous and functionally dyanmics1, even when they are structured and well-folded2, which limit the applicability of methods such as NMR, crystallography, or cryo-EM. Moreover, because of the lack of a large structure RNA database, and no clear correlation between sequence and structure, approaches like AlphaFold3 for protein structure prediction, do not apply to RNA. Therefore determining the structures of heterogeneous RNA is an unmet challenge. Here we report a novel method of determining RNA three-dimensional topological structures using deep neural networks and atomic force microscopy (AFM) images of individual RNA molecules in solution. Owing to the high signal-to-noise ratio of AFM, our method is ideal for capturing structures of individual conformationally heterogeneous RNA. We show that our method can determine 3D topological structures of any large folded RNA conformers, from ~ 200 to ~ 420 residues, the size range that most functional RNA structures or structural elements fall into. Thus our method addresses one of the major challenges in frontier RNA structural biology and may impact our fundamental understanding of RNA structure.


Main
Knowledge about RNA structure and dynamics is important for understanding functions 1,4,5 , designing novel devices 6 and developing RNA-targeting compounds 7 . Since the first threedimensional (3D) structures of tRNA were determined almost a half-century ago 8, 9 , there has been a slow increase in the number of RNA structures determined by NMR, X-ray crystallography, and 5 most recently, cryo-EM 10 , the backbone technologies for high-resolution structure determination.
All these methods rely on signal-averaging and thus are limited to conformationally homogeneous samples. Since RNA molecules are dynamic and conformationally heterogeneous shown by direct visualization under physiological conditions 2 , even for RNA as small as the HIV TAR RNA 11 , a single snapshot structure falls short of accurately describing the conformational landscape 12 . This 10 limitation explains in part both the scarcity and redundancy of RNA structures in structural databases 10,12 . Thus, given the rapid progress in RNA research and widespread applications in the biomedical and public health sectors, developing a method for studying the highly heterogeneous conformational space of RNA becomes ever-important and urgent. 15 AFM topographic images provide direct global structural information with a high signal-to-noise ratio (S/N) enabling visualization of individual molecules at a resolution sufficient to discern duplex helical grooves [13][14][15] . The usefulness of global structural information in restraining RNA structures was previously implicated 16,17 . However, the correlation between a topographic AFM image and the underlying atomistic topological structure, and its use for recapitulation of 20 individual RNA conformer structures have not been explored. Importantly, AFM images of individual particles represent individual conformers of single molecules, and thus could in principle make it possible to study the conformations of individual molecules without signal-averaging. This utility is akin to other single-molecule methods such as the widely used singlemolecule Förster resonance energy transfer (smFRET) 18 , which has deconvoluted heterogeneous populations in numerous protein and RNA systems but is limited to providing sparse spatial information as only the conjugated fluorophores are visible. 5 In this report, we present HORNET (Holistic RNA structure determination method using AFM, unsupervised machine learning (ML), and deep neural Networks (DNN)), a novel approach for determining the individual three-dimensional (3D) topological structures of heterogeneous RNA conformers based on AFM data (Fig. 1). Our method drives the conformational trajectory toward a convergence that satisfies both the weighted AFM pseudo-potentials and classical Gibbs free- 10 energy descriptions 19,20 (Fig. 1a; see Methods). The top structures are then selected and evaluated by both unsupervised and deep learning using a holistic consideration of all energetic and topographic information (Fig. 1b). Furthermore, we used deep neural networks (DNN) trained by a pseudo-structure database to provide an accuracy estimation of top structures in terms of rootmean-square-deviation (RMSD). 15

Images of individual RNA conformers in solution
We use RNase P RNA from Bacillus stearothermophilus as an example to illustrate the utility of AFM ( Fig. 2a-e, and Extended Data Fig. 3). These three topographic images (Fig. 2a) capture three individual RNA molecules (P1, P2, and P3) in three different conformational states, none of 20 which is identical to the crystal structure (Fig. 2c). The background noise of the particle topography is between 1-5% of the maximum height (Extended Data Fig. 4).  Tables 1 and 2) and DNN (II) (analysis details is described in the Methods, 5 Extended Data Fig. 2 and Extended Data Table 3). analysis of the AFM-topography, the auto-correlation value (ACV) of the AFM image of the particle. The spatial regimes with an abrupt decrease of ACV are shown in a zoom-in panel inset, where the kinks of ACV values indicate structure features present in the image at that spatial position, and the clear kinks for P1, P2 and P3 are highlighted by the dotted lines at ~0.87 nm -1 (11.5Å), 0.87 nm -1 (11.5Å), and 0.87 nm -1 (11.5Å), respectively. e, The first derivative of the ACV 15 plots to observe the full-profile variation. The gray areas indicate ACV discontinuity regimes, with the maximum spatial threshold at ~0.34 nm -1 (29Å) and a minimum of ~ 0.87 nm -1 (11.5Å), ~1.4 nm -1 (7.2Å), ~0.97 nm -1 (10.3Å), for P1, P2 and P3, respectively.
The resolution of the P1, P2 and P3 images was assessed by applying the low-pass Fourier filter 21 (Extended Data Fig. 5) where clear spatial information at 0.87nm -1 (11.5 Å), 0.90nm -1 (11.1 Å), and 0.80nm -1 (12.5 Å) for P1, P2, and P3, respectively ( Fig. 2d-e). The low resolution would seem to limit its use in structure determination. However, the RNA structural characteristics make AFM an ideal technique for the following reasons. RNA structure folding is hierarchical 22 and modular. 5 Highly conserved A-form duplexes, which count for more than 70% of the mass in RNA structures in structural databases, vary within ~1.5 Å in terms of the backbone RMSD 23 . Thus, in principle, given an initial structure as prior knowledge, 3D topological structures can be recapitulated from AFM molecular surfaces using a dynamic fitting with uncertainty significantly less than the inherent resolution limits of the AFM data itself, an approach similar to that used in X-ray 10 crystallography and cryo-EM 24 .
No structures of RNA have been determined using information solely from an individual macromolecule, as all reported structures were determined using signal-averaging methods. Thus, we first demonstrated our method by using a set of simulated data. Excluding structures of 15 ribosomes and other protein-RNA complexes, there are only four classes of RNA larger than 210 nucleotides (nt) with resolutions better than 3.5 Å. They are the holo adenosylcobalamin riboswitch, Group I and Group II introns, and RNase P RNA. RNase P is a multi-turnover ribozyme that processes the 5'-ends of pre-tRNA and other RNAs [25][26][27] . Its substrate promiscuity suggests conformational flexibility required for substrate binding and recognition 25 . Studies using 20 chemical probing and SAXS show that such large conformational changes, which can be as large as 30 Å 25,28-30 , occur via the motions of individual helical structural elements without disruption of base-pairing interactions 25,29 . We used the catalytic core domain of RNase P (PDB ID: 3DHS) 8 31 (see Methods) to generate the initial simulated data, called Benchmark 0 (BM0, Extended Data Fig. 6), at seven different levels of imposed Gaussian noise, 5,10,15,20,30,40 and 50% of the maximum Z-height of the AFM topographic image (Extended Data Fig. 7). An initial structure was best fit to the AFM molecular surfaces via classical Langevin coarse-grained (CG) molecular dynamics 32 for efficient sampling over broad conformational space to search for the best fit to an 5 experimental AFM image.
The impact of the topographic information from AFM images is illustrated by the residue dynamic cross-correlation maps (DCCM), where the AFM-restrained DCCM is only a subset of the free (unrestrained) DCCM (Extended Data Fig. 8a). Our method recapitulates 3D topological 10 structures of RNase P conformers with the lowest RMSDs ranging from 2.97 to 6.04 Å relative to the ground-truth (GT) structure in the top cohort, depending on the applied noise level (Extended Data Fig. 8b). Within the practical limit of computing power, the efficiency of recapitulation depends heavily on initial input structure models. We tested HORNET on five additional types of RNA with very different topological structures derived using various methods (Fig. 3): RNA S142 15 (BM1) and 1076 (BM2) (Extended Data Fig. 9 and 10, and Extended Data Table 4), which were generated using FARFAR2 33 ; an AFM-derived structure of the adenosylcobalamin riboswitch (BM3) 34 (Extended Data Fig. 11), and topological structures of group II intron (BM4) 35 and RNase P (BM5) 36 (Extended Data Fig. 12), both of which were derived from small angle X-ray scattering data 37 (Fig. 3a). These represent the majority of classes of RNA larger than 210 20 residues in the PDB database.

Identifying top structural models using unsupervised ML
Although top structures could be recapitulated together with vast numbers of other conformers, it is challenging to identify the top structures that are representative of the "true" structures underneath the AFM topographic surface using simple conventional correlation or approaches.
The cross-correlation score (CC AFM ) between a molecular surface and a structure 36 alone is neither insufficient nor designed to identify top structure models because a near-perfect CC AFM could be 5 achieved at the expense of structural integrity and the hierarchical folding principle. We used a holistic unsupervised ML by considering a combination of three types of information as input (  (Fig. 3a), and a successive clusterization algorithm that 15 identifies the cluster of models with the lowest energetic distribution (Fig. 3b) according to the Go and total energies 32 (see Methods). The Go potential contains information about how well a given model folds. Importantly, the use of the Go energy does not prevent the AFM-biasing pseudo potential from sampling structures that are dramatically different from the initial structure. In the final step, the top cohort of models that exhibit both the lowest energies (Go, local, and total), as 20 well as the best agreement with CC AFM , are selected from the cluster (Fig. 3d, see Methods). Each step of the unsupervised ML pipeline iteratively selects the sub-population of the lowest RMSD models at each noise level ( Fig. 3e and Extended Data Fig. 18). In all cases at the tested noise  where the cluster color in magenta represents the selected representative cluster. c, Cohort selection applying energy and topology threshold to select models (gray) from the selected cluster (magenta). d, the post-unsupervised ML analysis of the progression of the selection process in terms of RMSD from the GT structures in each benchmark case, from the total trajectory population (orange), through the energy filter (blue), and clustering (magenta) to the final top 10 cohorts (grey). The vertical axis indicates the density of populations. e, The top structures rendered in color in terms of RMSF compared to the GT structures. levels, almost all the top models fell within the final cohort, and the top 10 lowest-total-energy models from the unsupervised ML cohort had an average RMSD of 5 Å relative to the GT structure, with the lowest RMSD of ~3.5 Å. Interestingly, even at the higher noise levels, the average RMSD of the top 10 selected models was ~5 Å (Extended Data Table 5 and Extended Data Fig. 18), illustrating the robustness of HORNET.

Estimating model accuracy using supervised ML
While unsupervised ML is capable of selecting top cohorts, it does not provide an estimate of accuracy, which is critical for the determination of unknown structures. Since the information about the accuracy of a conformer structure is contained within structure models as well as AFM topographic images, in principle, a well-trained DNN is capable of providing accuracy estimation, 10 given the latest success in protein structure prediction 3 . However, using DNN to estimate the accuracy of RNA structure models is far more challenging than for proteins 16 . The DNN training models used for protein structure prediction 3 leverage the abundant information available in databases and the sequence-3D structure correlation and thus are not applicable to RNA because of the lack of both 38 . Geometric deep learning has some success in structure prediction and 15 accuracy estimates for relatively small RNAs but fails for large RNAs because of the lack of global restraints 16 and geometric equivalence among conformers, and thus is not applicable to predicting heterogeneous structures of the same sequence.
To overcome this problem, we created a pseudo-structure database containing more than three 20 million conformer structure models, ~1.5 million trajectory structural models from BM0 and ~1 million each from BM1 and BM2 (see Methods). We assume this pseudo structure database to cover a broad conformational space between initial and final structure models that differ by as large as 35 Å in terms of RMSD. This input data taken from the pseudo database for the DNN training was partitioned as 80 % for training the model (training set) and 20 % for initial validation and testing of underfitting and overfitting in regularization processes (Extended Data Table 3).
Aiming to assess if the learning by DNN is generalizable for not only the structural models from different trajectories but also from completely different RNAs 39-41 , we use 5% of the dataset from 5 BM5 as our validation dataset and BM3 and BM4 as our blind tests. Such a generalization scheme is shown to have a more realistic and robust assessment of the performance than that including fractions of data from all benchmark datasets 40 . To assess the convergence of the trained models, we evaluated the progression of the loss function by the training epoch and selected the trained model and epoch that have the smallest loss in the validation set ( Fig. 4a-b). Importantly, the input 10 of our algorithm does not utilize information from structural databases. The DNN showed high performance for the predicted models in the initial training test set, with a Pearson coefficient of 0.97 (Fig. 4a). To evaluate the performance of HORNET for different trajectories and RNAs with different shapes, sizes, and sequences, we then tested the full trajectory of the five benchmarks to illustrate that our holistic DNN is learning generalizable structural "metrics" of accuracy, not 15 merely memorizing specific structural features ( Fig. 4c-g). Critically, all the benchmark tests used only data that the holistic DNN had never seen in either the training or initial validation. Our SML method ( Fig. 1b-II) corroborates and cross-validates the results using unsupervised ML ( Fig. 4h; Extended Data Table 6), while also estimating the accuracy of top cohorts in terms of RMSD. The tests in the five benchmark cases indicate that HORNET can recapitulate topological structure 20 models from AFM images with an accuracy better than 6Å RMSD.

Validation using independent datasets
We evaluated a scenario where the initial structures were generated using FARFAR2 33 and chosen based on their highest ARES 16 scores 9.23 and 9.04 for BM1 and BM2, respectively. The RMSDs between the GT structure and BM1 and BM2 are 14.3 Å and 20.0 Å, respectively and between the S142 and 1076 is ~18.7 Å, indicating that they adopt dramatically different topologies from each other and from the GT structure ( Fig. 3a and Extended Data Fig. 10). In each of these cases, both 5 unsupervised ML and DNN were capable of identifying the top cohorts ( Fig. 3d and Fig. 4b,c,g). DNN performed well in the structure predictions, with Pearson correlations of 0.80 and 0.89, respectively (Fig. 4b,c). Furthermore, a poor RMSD predicted by HORNET is indicative of the non-convergence of a structure calculation. Structure S257, which has the best ARES score of 7.72 among 8000 conformers generated by FARFAR2 did not converge through dynamic fitting 10 within practical computing time based on the poor predicted RMSD (Extended Data Fig. 19). Thus, S257 serves as an extreme case for testing our method, where the HORNERT was able to evaluate the non-convergency of the structure fitting (Extended Data Fig. 20) a primordial step in a real scenario where the GT structure is unknown and non-available. After a molecular dynamics simulation without AFM restraints starting from S257, we applied the unsupervised ML method

Validation using data from a smaller RNA
The 210-nt cobalamin-sensing riboswitch (rCbl, BM3) is capable of folding into heterogeneous conformations 2 . This RNA has a completely different sequence, shape, size, and folding from  other RNAs used for training and validation in this study. Thus, BM3 serves as a unique benchmark for further testing the capability of HORNET. The crystal structure of BM3 was used 42 as the GT model and the initial structure is one of the conformers revealed by AFM 34 , which has an RMSD of 10.2 Å from the GT structure. Our unsupervised ML was capable of identifying the top cohort of models with the lowest RMSD of ~ 2.7 Å, and the supervised ML was able to identify a similar cohort and correctly score to the top 10 models in terms of RMSDs with a Pearson score of 0.84 for the whole trajectory (Fig. 4d).

Validation using data from structures generated from SAXS
BM4 and BM5 were chosen to demonstrate the capability of recapitulating structures where the initial models were generated using low-resolution experimental techniques. For this purpose, we used SAXS-derived models of BM4 (395 nts) and BM5 (298 nts) 43 , whose initial models were 10 16.1 and 14.0 Å from their GT crystal structure models in terms of RMSD 35,36 . During the dynamic fitting, both RNAs showed a wide energy landscape; however, unsupervised ML was able to select a cohort containing models with a minimum RMSD of 3.8 Å for BM4 and 5.4 Å for BM5 ( Fig. 3). The SML performed well in scoring the models, especially in predicting the accuracy of the models of top cohorts with Pearson coefficients of 0.77 and 0.64 for BM4 and BM5, 15 respectively ( Fig. 4f,g). Considering the results from all benchmark tests, we conclude that HORNET is capable of estimating the accuracy of recapitulated structural models of the top selected cohorts in terms of RMSD. Predicted RMSD becomes less accurate beyond ~7.5 Å, which may be explained by both energetic and topographic equivalency among conformers, or the underrepresentative pseudo structure database (Fig. 4i). 20

Determining structures of conformers using experimental AFM images
One ultimate goal is to recapitulate the structures from experimental AFM images of individual RNA molecules with unknown conformations (Figs. 2a and 5). The AFM images of the three particles showed discernable structural features, and the layouts of the structural elements in each particle were distinctly different from one another and from the crystal structure 36 , evidenced by their CC AFM scores, of 0.77, 0.80 and 0.87 for AFM particles P1, P2 and P3, respectively (Fig.   2C). P1 had the lowest CC AFM score, suggesting a significant deviation from the crystal structure. Thus, these three particle images capture three different conformations along the folding landscape 5 (Fig. 2a). The dynamic fitting of the crystal structure for P1, P2 and P3 shows significantly different conformational landscapes ( Fig. 5a and Extended Data Fig. 23). P1 samples a vast area of atomic displacements, while P3 had an intermediate sampling. P2 had the most restricted displacements, likely due to the presence of more short-distance information in the P2 AFM image ( Fig. 2e) than the others. The 3D dynamic fitting trajectories were analyzed by unsupervised ML 10 and DNN (Extended Data Tables 7 and 8). The best-recapitulated models for P1, P2 and P3 were between 4-6 Å RMSD ( Fig. 5b and Extended Data Table 7). The unsupervised ML and DNN select models in similar ranges of energies and CC AFM (Fig. 5c, Extended Data Table 8). The DNN shows a wide range of predicted RMSDs for P1 from 5.6-31Å, while P2 and P3 showed narrower distributions to smaller values, with a minimum RMSD of 4.4 and 4.7, respectively (Fig. 5c). This 15 shows that the trained model is capable of identifying and scoring different ranges of atomic displacements. The RMSDs between the top recapitulated model for each particle and crystal structure, excluding residues with missing electron density in the crystal structure were ~15, ~11, and ~7 Å, for P1, P2, and P3, respectively. Rg of the crystal structure is 45.1 Å, indicating a more compact structure, similar to that of the P3 conformer (45.8 Å). Indeed, P3 most resembles the 20 crystal structure by visual inspection (Fig. 5b). On the other hand, P2 appears to be more extended (49.0 Å), and P1 is more compact than the others (41.0 Å). The RMSF between the crystal model and the recapitulated structures shows that major fluctuations occur in the regions where the crystal structure could not be modeled due to insufficient electron density (Figs. 5d and 2a,c).

Fig. 5|
Recapitulating three topological structures from three experimental AFM particle images. a, Contour maps of the total energy and CC AFM . The green color gradient indicates the population density, 5 and the symbols indicate the cohort models selected by unsupervised ML (blue) and the top 10 models scored by DNN (red) for P1 (left, total 4 million models), P2 (middle, total 4 million models) and P3((right, total 2 million models). b, The top structural selected models rendered in molecular surface mode. c, DNN RMSD prediction for the full trajectory (green), and evaluation for the selected cluster and cohort from unsupervised ML for P1, P2 and P3, respectively. d, RMSF profiles between the crystal model and the 10 recapitulated models of P1 (black), P2 (cyan) and P3 (orange). Residues with missing electron density in the crystal structure are highlighted in grey.

Discussion
Our method addresses major challenges in studying topological structures of highly heterogeneous 15 and flexible RNA molecules by obviating the dependency on signal-averaging. The ability to recapitulate topological structures from AFM images of individual RNA molecules could drastically expand our knowledge of the heretofore uncharted RNA 3D conformational space 10 , far beyond the few snapshots of static structures in databases. Given that more than 80% of the human genome is transcribed into RNA 44 , more than 80% of which contain structural elements, HORNET has the potential to accelerate our understanding of the topological structures of large 5 RNA with known biological significance.
Estimating the accuracy of an unknown structure is a grand challenge in structural biology 45 .
Recent success in protein 3 and progress in RNA structure prediction 16 is highly encouraging.
Because of the conformational heterogeneity of RNA, incorporating individual conformer-specific topographic global restraints for the recapitulation and accuracy evaluation of heterogeneous 10 structure models using DNN is a viable approach to chart the highly heterogeneous RNA conformational space. Given a sufficient structure database that covers a full RNA conformational space for DNN training and topographic information, we believe that accurate topological structures of individual RNA conformers in solution can be determined.

Acknowledgments
We thank both R. J. L. Townshend and R. O. Dror for assistance in using ARES; NIH HPC staff for computing resources.

RNA structure calculation applying topographic restraint
Given that duplexes are well-conserved and the predominant building blocks in folded RNA structures by far 23 , they could be considered semi-rigid bodies within a folded RNA structure. Since they are covalently 5 connected, these duplexes can be treated as kinematic chains. Adding kinematic constraints between rigid bodies will significantly decrease the degrees of freedom of a rigid body system 46 , and imposing the topographic constraints in addition to the kinematic constraints further reduces the degrees of freedom of sampling space.

10
A high-resolution AFM image is more than just a "frame" of a molecule. The width and pitch of an A-form RNA duplex are ~25 and 30 Å, respectively, which are on a similar scale to a sharp AFM probe and sensitive to detection. Thus, given an achievable imaging resolution of 10-15 Å 21 (Fig. 2), major structural features such as grooves and pitches of long duplexes along with molecular shapes and topological folds of larger where total energy is the system energy for the molecule;  ∑ ∑ is the covalent energy term that includes bond lengths, angles, and dihedrals: whereas the noncovalent energy term ∑  ( ) ∑ includes stacking, base-pairing, short-and long-25 range interactions, specifically van der Waals and electrostatic interactions:  AFM ,  c ,  stacking ,  pairing and  contact are the scaling factors for AFM, covalent, and secondary structural interactions including stacking, base-pairing, and contacts, respectively. We usually set  contact to unity to avoid bias towards the initial structure, whereas  AFM ,  c ,  stacking and  pairing are empirically determined to achieve the optimal balance between enforcing the integrity of primary and secondary structures (the 5 hierarchical principle) and achieving the best fit to the topological restraints at the same time (Eq. 1 to 3).
In this study, we found that the optimal empirical values are  c = 5,  stacking = 9 and  pairing = 9. The optimal value for  AFM is dependent on the topography of the particle, the noise level of an image, and the closeness of the initial structural model to the "true" structure underneath the molecular surface. We scan through dynamic fitting calculations with  AFM from 2 to 50.

10
In summary, the topological restraint imposed by Eq. 1 to 3 together with the energetics of the primary, secondary and tertiary structures significantly reduces the degrees of freedom that a molecule can sample in a coarse-grained MD trajectory. To perform the coarse-grained MD trajectory, we make used of Cafemol software . A user friendly pipeline and calculation setup is described in the Supporting 15 Information material and code availably at https://github.com/PNAI-CSB-NCI-NIH/HORNET. The challenge now is to identify the structural models closest to the "true" structure underneath the AFM molecular surface.

Unsupervised machine learning 20
Our UML approach assumes that the classical molecular dynamics simulation guided by topographic information can sample the real native conformational space of the RNA and that the correct models can be identified based on the established hierarchical folding principle 50 , energetics 51 and agreement with topographic restraints. Our UML algorithm is able to decipher the underlining correlation of the data set, resulting in the recognition of generalizable models without pre-training or data labeling. Each analyzed 25 data set (trajectory) is unique, and the machine does not have any expected pre-labeled output from a given input. Our UML algorithm consists of three main steps: (i) energy filtering, (ii) principal component analysis (PCA) and clustering, and (iii) cohort model selection.

Energy filtering
This step is performed to remove outliers of unstable conformers generated in the trajectory. The input data is filtered from the whole trajectory (raw data) with m total frames (trajectory structures), and the filter function is applied based on the mean value of each j energy component:  Table 1). 15

Principal component analysis (PCA) and clustering
There are a total of 10 features which include energetic and topographic information that each frame is associated with. We reduce data dimensionality to a small number of components while maintaining the maximum amount of statistical information by applying PCA and clustering. These 10-dimensional vectors 20 are the features describing energetics and topology, the i-th element of the m-th structure { }: The vector l has index m that runs from 1 (first frame) to m (last frame) representing all the structure models of the trajectory and all features, which include energy terms E total , E Go

⋮ ⋮ ]
To derive a linear combination of the 10 features and maximize variance among the components we applied principal component analysis (PCA). In this procedure, we describe our matrix with another linear combination basis. In other words, we can find a transpose vector ′ of constants 1 , 2 , … , 10 5 where: ′ = 1 1 + 2 1 + 3 + ⋯ + 10 1 + ⋯ + 10 (10) However, the Eq. 10 is not a unique solution to describe a linear combination basis of the matrix , other non-correlated solution ′ exists, and continues up to the n th linear function with ′ : with the n-rows of the linear transformation matrix that represents the principal components (PC), is the matrix ( × ) related product of . The variance presented in this linear combination is quantified using the covariance matrix S 53 : Where k and j are combinations of the terms of l (Eq.9). In order to have linear combinations and 15 uncorrelated variables, the covariance matrix needs to be transformed to a diagonal matrix A, that is: P is orthogonal and can be described as a normalized matrix, denotes the transpose of . This last equation can be resolved by calculating the eigenvector of the covariance matrix: =  , where x is a vector of P with eigenvalues , which are the diagonal elements of A. In our analysis, this equation is 20 resolved using singular value decomposition (SVD) 54 . Before applying SVD, we standardize all the features using Eq. 8.
To define the number of components describing the trajectory, a primary screening is performed for each dataset to analyze the cumulative variance versus the number of components. The general profile for this 25 plot can be seen in Extended Data Fig. 1a, and the initial values show a steep increase that reaches a very low variance rate with a flat area for a higher number of components. The chosen cutoff is located at the beginning of the lower cumulative variation rate. Using the PCA-space data matrix the k-means algorithm 55 is applied to separate the data into clusters where each population is defined by the minimalization of the inter-cluster entity distances to the geometric center. A similar strategy is used to define the number of clusters for each trajectory, but now the threshold is defined by the correlation plot between the 5 within-cluster sum of squares (WCSS) and the number of clusters. The number of clusters is decided using the first derivative of the WCSS (Extended Data Fig. 1b), at the intersection between considerable variations and a flatter pattern. The outcome is that the representative cluster with the correct structure population should be the one with the structures containing the lowest native energy values. Using that criterion, the E local , E total and E Go energy distributions are analyzed and the cluster with lower mean 10 distribution is selected.

Cohort model selection
The cohort of models from the trajectory is selected using the representative cluster. The final selection criteria are applied assuming that the best models must be observed at the highest cross- 15 correlation with the AFM topography (CC ) and, at the same time, the energy values need to be populated at the lowest values. A threshold filtering procedure is applied to the representative cluster using the cutoff values described in Extended Data Table 2.

Supervised deep neural network design 20
Based on the question of whether the most fundamental characteristics of models such as their energetics and known topology of a structure contained in the AFM experimental data would provide enough information to consistently determine the RMSD between the structural model and an unknown GT structure, we designed a deep neural network (DNN) 56 to learn how these fundamental characteristics could be intrinsically correlated. 25

Data Preparation and Features
The dense layers of our DNN are connected by passing the information of each layer forward to the next one, known as a feedforward neural network. The training of a DNN relies on the ability to find the best weights ( ) and bias term ( ) in all the layers in a way that minimizes the difference between the true value of the feature to be predicted and the output layer ( ) of this DNN: After the featurization and normalization, we applied a standard scaling so that the distribution used for 25 training (and testing) would be standardized to allow an optimized convergence of the training.
Consequently, all the data used for subsequent RMSD evaluations need to be normalized by the same training normalization parameters to be consistent with the training data and evaluation.
In this study, the output of the DNN is driven to be as close as possible to the RMSD of the training examples by using all the discussed features. The weights and bias for the first layer ( 1 , 1 ) with k neurons will be: where each column represents the weights for a single neuron and each row the weight per feature, while b is the bias term, one per neuron. The data X and the target of the RMSD prediction on the output layer

Loss function
The process of learning in an artificial neural network (ANN) depends on the loss function. The mean squared error (MSE, also known as L2 loss or quadratic loss) and the Huber loss, had both great performances in our training: where ̂ is the prediction for a single training sample and its true value. in the Huber loss sets the region where the loss will assume a squared difference or absolute difference, so it does neither overweight the outliers such as in the MSE loss, nor simplify the loss by the averages. In the training of our DNN model, we used the Adam 57,58 optimizer to minimize the loss function.

Underfitting and Overfitting
To avoid overfitting and to be able to keep increasing the complexity of our ANN we added regularization penalties to the training. Within the known regularizers we evaluated training using Ridge regression (L2 regularizer) and dropout technique. Ridge regression adds a penalty to the loss function term for all the weights squared, preventing the weights to assume too higher values. In the dropout technique, in each 15 step of the training/optimization, some neurons have a given (set) chance to be turned off. We also tested increasing the size and variety of the dataset by adding more data (trajectories) -see Extended Data Table   3.

Optimized architecture 20
To train the DNN and access its performance, we split dataset X (which contains only one kind of RNA) into 2 parts: the training, and the initial training test set, where we could check the regularization effect over the same trajectory and assure that the regularization was blocking the train to overfit the trajectory over the split, hence providing similar loss on both sets. The training set had 80% of the m data examples, while this initial training test set had 20%. The optimized dataset that yields the best performance was 25 built using the whole data from the BM 0 data, with an addition of 5% of data from trajectories of the BM1 and BM2 each (Extended Data Fig. 2). The validation set was created by using a different RNA trajectory simulation, BM5, so that the best loss on the validation set would point to the place where the training and learning with a given RNA was still generalized to another RNA or trajectory, applying early stopping on the evaluated loss considering RMSDs up to 10 Å to weight a better performance on smaller RMSDs than larger ones. Hence, the validation set was used for both tuning the hyperparameters and for selecting the best-trained model, while further tests over the benchmarks address if our model can 5 generalize its findings and learnings to other RNAs not contained in the training data, with different RNA sizes, assessing what would be the real performance of our model to other unknown RNAs and trajectories than the one used for training.
We optimized the architecture for this work by many step-by-step random searches and subsequent fine-10 tuning of the hyperparameters, which include the number of layers, the number of neurons per layer, weight initialization, neuron activations, regularization penalties and types, the optimizer algorithm as well as the learning rate. Additionally, more than 50 different compositions of the training dataset were also used for training models (Extended Data Table 3). The number of hidden layers tested (also by a The non-linear activations tested were relu, leaky-relu, elu and gelu for each layer separately, or a selu 59 activation set for all layers. For regularization, each layer could use either the Ridge regression and/or a Dropout 60 chance (for selu the Alpha Dropout 59 was used instead of Dropout to keep the self-normalizing 25 properties). Our optimized architecture has only 3 hidden layers with a decreasing number of neurons, 128 in the first layer, 64 in the second, and 16 in the third, using elu activation with a common dropout rate of around 20% as the regularizing agent. Deeper networks also had a good performance, but with the cost of many weights to train without clear improvement. The total number of trainable parameters with the current architecture is around 11k. Within initializations, we tested Glorot uniform, Lecun normal and 30 He normal, with the latest achieving the best performance as the weight initializer and using Adam as the optimizer algorithm with a standard learning rate of 0.001, with the mini-batch size of 128 and using Huber loss. Table 3 summarizes all the hyperparameters for searching and tuning, as well as different datasets and data filters applied to create dataset X as input to the models. The models were trained using 5 NIH-HPC (Biowulf) k80/k100x nodes:

Extended Data
•  20 filter device (Millipore). Purified RNA was subjected to several buffer exchanges using a Centricon unit (Millipore) with 30 kDa molecular weight cut-off membrane against refolding buffer (50 mM MES buffer pH6.8, 100 mM KCl, 1 mM MgCl2), then concentrated to 2 M, aliquoted, and stored at -80 °C before utilization. 25 For AFM experiments, the RNA sample at 2 M concentration was annealed in the refolding buffer (50 mM MES buffer pH6.8, 100 mM KCl, 10 mM MgCl2) at 65°C for 2 min followed by stepwise cooling to 37°C over 30 min, and then kept at 4°C before AFM measurements. To dilute the RNA sample to the required concentration (20 nM) for AFM, 1:100 volume of low-salt buffer (50 mM MES buffer pH 6.8, 10 mM KCl, 1 mM MgCl2 (preequilibrated at 4°C) was used, and the 30 diluted sample was immediately deposited onto mica pre-treated with 1-(3-aminopropyl) silatrane (APS) for immobilization 61 .

Experimental AFM topography
Experimental AFM image acquisition 5 The detailed procedure for the AFM image acquisition is described elsewhere 34 . Here is a brief outline of the procedure. P1, P2 and P3 particle images (Extended Data Fig. 3) were recorded under the solution conditions described above using a Cypher VRS AFM (Asylum Research, Oxford Instrument) at 4 ℃ with amplitude-modulated AC mode at a scan rate of 1 Hz (commonly known as tapping mode) using FASTSCAN-D-SS probes (Bruker, CA). For RNA immobilization, 50 mM APS stock was freshly diluted 300-10 fold in deionized water right before use and then used to coat a freshly cleaved muscovite mica (Grade V1) (Ted Pella Redding, CA) and incubated for 30 min, followed by rinsing the mica surface with deionized water and drying gently with filtered nitrogen gas.

Image processing 15
The detailed procedure for AFM image processing is described elsewhere 34 . Briefly, raw images were first processed using SPIP (Scanning Probe Image Processor) software: plane leveling to the particle-free region by applying 3 rd -order polynomial, followed by spike filtering to remove artifact streaks, and FFT (Fast Fourier Transform) to remove high-frequency noise (Extended Data Fig. 3). The final image resolution was increased to 4096 x 4096 pixels by doubling the number of pixels twice. Single-particle images were 20 cropped from the processed images and converted to pseudo-AFM (*.txt) with a digital resolution of 5-Å per pixel in MountainsSPIP software for structure calculation.

AFM noise estimation
For quantification of the noise present in the Z coordinate, we used the cropped single molecule from the 25 full recorded AFM topography as input, and the Z values were collected for defined x-and y-coordinates of the "empty" area around the molecule. The Z-coordinate values of the empty horizontal and vertical spaces can be described by a normal function, where the mean value of this distribution represents the mean noise and the uncertainty as the standard deviation (sigma). The mean noise value and uncertainty were evaluated for P1, P2 and P3 before and after image processing (Extended Data Fig. 4). In this analysis, we are considering all the experimental sources that result in noise randomly distributed over all recorded data as a background signal. 5 The topography resolution assessment was performed using an auto-correlation value (ACV) approach 21 .

AFM resolution estimation
There are two principal steps to be performed in this method. First, using the processed image, we calculate the 2D Fast Fourier transform (FFT) of the AFM image and a defined ring-size (pixels) cut-off is applied to select a portion of the image in Fourier space. Afterward, the image is back-calculated to real space; this step is described as a low-pass filtered Fourier ring. In the second step, we calculate the ACV 10 between the original image (R) and the resulting one from the Inverse Fast Fourier Transform (IFFT) for each of the low-pass filtered rings ( ′ ). The comparison between the original image with its resulting image from the low-pass filter is performed using the auto-correlation equation (Eq.17). A loop interaction was applied starting from low to high frequency, where the ACV starts from low correlation values up to values near to 1.0 where the low-pass cut-off is close to the particle dimension in real space. In where i is the (x,y) pixel position with a Z-height (R) of the original reference image and a Z-height (R') after applying a low-pass filtered Fourier ring; and ′ represent the mean values of the Z-heights of reference and low-pass filtered images, respectively. 25

Benchmark 0
The initial structural model as Benchmark 0 was selected based on the best ARES score of 9.9 16 (Extended Data Fig. 6) from a pool of models calculated using Cafemol 32 . The RMSD between this structure and the ground-truth (GT) structure is 21.4 Å. The GT structure was generated based on the crystal structure of the catalytical domain of RNase P RNA, denoted as trRNaseP RNA 31 (accession code: 3dhs). The missing electron densities of the structure were modeled using SimRNA 62 and further refinement using Coot 63 . 5 AFM images of the GT model were calculated using a resolution of 5.0 Å per pixel, added with seven different simulated Gaussian noise levels, i.e., 5,10,15,20,30,40 and 50% of maximum Z-height (Extended Data Fig. 7).

10
The dynamic fitting using k158597 as initial and the AFM topography of the GT was performed for all noise levels (Extended Data Fig. 7) using the common configuration described in Supplementary Information for a total of 20x10 6 steps (~0.9µs).

Benchmarks 1 and 2 15
This benchmark was designed aiming to tackle unknown 3D structure models while the primary sequence and secondary structure information is known. For this task we first applied FARFAR2 64 -rna_denovo application generating 8000 structure models of truncated RNase P RNA (trRNase P) using the primary sequence, secondary structure (Extended Data Fig. 9), and atom pair distance constraints of the wellknown loop interaction L15.1-L5.1 described in detail previously 65 (Extended Data Table 4). For structure 20 refinement the minimize_rna function was applied as potential during the FARFAR2 structure prediction run, using parallel jobs on a 28-core 2.3 GHz x2695 processor.
The FARFAR2 scoring function was calculated for all the predicted models and analyzed as a function of the main energy terms. The sampled refined structures show a range of RMSD with a maximum of 46 Å 25 and a minimum of 14 Å from the crystal model (PDB id 3dhs). We selected three models from all predicted structures from FARFAR2 using the following criteria: one model with the best ARES prediction (S257), one being located in the region of both ARES and FARFAR2 low scores and an RMSD from the GT structure of at least 20 Å (S1076), and one model with the lowest RMSD (S142), Extended Data Fig. 10.
ARES selected model S257 as the best model from the FARFAR2 ensemble, a model that presents dramatically different folds from the crystal model, and an RMSD of ~30 Å (Extended Data Fig. 10b). Using an RMSD threshold of 20 Å and scoring the models using the energetic scoring function of FARFAR2 and 5 final score of ARES, the best model was S1076 with an RMSD of 20.0 Å, where this model shows a folding similar to the crystal model (Extended Data Fig. 10b).

Benchmark 3
We applied our method to the adenosylcobalamin riboswitch (Cbl) crystal model (accession code: 4gma), 10 which has a folding and size (210-nt) different from the RNA used in training and Benchmarks 1 and 2.
The structure calculation was performed with a total of 20x10 6 steps (~0.9µs) using an AFM-topography generated with 5 Å per pixel (Extended Data Fig. 11). The final trajectory generates ~6.6 million models that were analyzed using UML and DNN (Figs. 3 and 4).

Benchmark 4 and Benchmark 5
These two benchmarks were designed to test our method using the initial models determined by lowresolution experimental data. In this case, we used the topological structures of RNase P RNA (298-nt) and group II intron (387-nt) 43 determined by using the secondary structure and small-angle X-ray scattering (SAXS) data 37 . The GT AFM images were calculated using the crystal models for the RNAs with a resolution 20 of 5Å per pixel, pdb id 2A64 36 and 4E8K 35 , respectively (Extended Data Fig. 12). The structure determination was performed using a trajectory with a total of 60x10 6 steps (~2.7µs). The final trajectory generates ~13.4 million models that were analyzed using UML and DNN ( Fig. 3 and 4). 25